@AdwitaSingh1711

Summary:
Implements a default timeout mechanism for functions in src/execution/evaluator.rs: a warning is emitted after 30 seconds and the function is aborted after 300 seconds. The timeout can also be overridden per function by specifying a timeout in the decorator arguments, similar to how caching is configured.

Fixes #658

Examples:

  1. In examples/manuals_llm_extraction/main.py we can add @cocoindex.op.executor_class(timeout=15) to override the current 300-second timeout, as shown:
[Screenshot 2025-11-02 at 8 09 34 AM]
  2. Additionally, if any other function takes longer than 30 seconds to execute, a warning message is displayed. If the execution time exceeds 300 seconds, the function times out.

Checklist:

  • I have run all tests and pre-commit checks, which have passed.
  • Cache, logs and others are not accidentally added to git's tracking history.
  • My commits follow the conventional commits format.

let behavior_version = executor.behavior_version();
let timeout = executor.timeout()
    .or(execution_options_timeout)
    .or(Some(Duration::from_secs(300)));
Member

Let's put the default value into a global const. On second thought, we may want to start from a larger value: use 1800 seconds for now (later, once things are more stable, I'll gradually reduce it).

Author

Done. Added TIMEOUT_THRESHOLD and WARNING_THRESHOLD to both src/builder/analyzer.rs and src/execution/evaluator.rs
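
The fallback order discussed above (per-function timeout, then the execution options, then a global default) can be sketched as follows. This is a minimal illustration, not the PR's actual code: `resolve_timeout` is a hypothetical helper, and the assumption that the final value is a plain `Duration` rather than an `Option<Duration>` is mine. `TIMEOUT_THRESHOLD` uses the name from the comment above and the 1800-second value the reviewer suggested.

```rust
use std::time::Duration;

/// Global default abort threshold in seconds (value suggested in review).
const TIMEOUT_THRESHOLD: u64 = 1800;

/// Hypothetical helper: picks the effective timeout for one function.
/// The per-function setting wins, then the execution options,
/// and the global default applies only when neither is set.
fn resolve_timeout(
    executor_timeout: Option<Duration>,
    execution_options_timeout: Option<Duration>,
) -> Duration {
    executor_timeout
        .or(execution_options_timeout)
        .unwrap_or(Duration::from_secs(TIMEOUT_THRESHOLD))
}
```

With this shape, a function decorated with `timeout=15` would resolve to 15 seconds regardless of the other two sources, while an unconfigured function would fall through to the global default.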


let op_name_for_warning = op.name.clone();
let op_kind_for_warning = op.op_kind.clone();
let warn_handle = tokio::spawn(async move {
Member

I think we don't need tokio::spawn.

tokio::select! is sufficient (example)

Comment on lines 394 to 401
eprintln!(
    "WARNING: Function '{}' ({}) is taking longer than 30s",
    op_kind_for_warning, op_name_for_warning
);
warn!(
    "Function '{}' ({}) is taking longer than 30s",
    op_kind_for_warning, op_name_for_warning
);
Member

We only need one. Let's keep warn! and get rid of eprintln!

Comment on lines -518 to -554

// Assemble input values
let input_values: Vec<value::Value> =
    assemble_input_values(&op.input.fields, scoped_entries)
        .collect::<Result<Vec<_>>>()?;

// Create field_values vector for all fields in the merged schema
let mut field_values = op
    .field_index_mapping
    .iter()
    .map(|idx| {
        idx.map_or(value::Value::Null, |input_idx| {
            input_values[input_idx].clone()
        })
    })
    .collect::<Vec<_>>();

// Handle auto_uuid_field (assumed to be at position 0 for efficiency)
if op.has_auto_uuid_field {
    if let Some(uuid_idx) = op.collector_schema.auto_uuid_field_idx {
        let uuid = memory.next_uuid(
            op.fingerprinter
                .clone()
                .with(
                    &field_values
                        .iter()
                        .enumerate()
                        .filter(|(i, _)| *i != uuid_idx)
                        .map(|(_, v)| v)
                        .collect::<Vec<_>>(),
                )?
                .into_fingerprint(),
        )?;
        field_values[uuid_idx] = value::Value::Basic(value::BasicValue::Uuid(uuid));
    }
}

Member

This code should be kept (merge mistake?). Please bring it back.


pub struct RetryOptions {
    pub retry_timeout: Option<Duration>,
    pub per_call_timeout: Option<Duration>,
Member

per_call_timeout is not used. Why do we need this?

class _ExecutionOptions:
    max_inflight_rows: int | None = None
    max_inflight_bytes: int | None = None
    timeout: int | None = None
Member

Let's use datetime.timedelta | None here; a stronger type is preferable.

batching: bool = False
max_batch_size: int | None = None
behavior_version: int | None = None
timeout: int | None = None
Member

Same here, let's use datetime.timedelta | None.

};
use futures::future::{BoxFuture, try_join3};
use futures::{FutureExt, future::try_join_all};
use tokio::time::Duration;
Member

This one is an alias. Let's directly use std::time::Duration.

use futures::{FutureExt, future::try_join_all};
use tokio::time::Duration;

const TIMEOUT_THRESHOLD: u64 = 1800;
Member

Let's use the Duration type here.
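
A `Duration`-typed constant is possible because `std::time::Duration::from_secs` is a `const fn`. The sketch below assumes the two thresholds named earlier in the thread with the values discussed (30 seconds and 1800 seconds); `warning_message` is a hypothetical helper added only to show that the message can then derive its number from the constant instead of hard-coding "30s". The argument order mirrors the warning call quoted earlier in the review.

```rust
use std::time::Duration;

// Typed thresholds: the unit is carried by the type, not by a comment.
const WARNING_THRESHOLD: Duration = Duration::from_secs(30);
const TIMEOUT_THRESHOLD: Duration = Duration::from_secs(1800);

/// Hypothetical helper: builds the slow-function warning text from the
/// constant, so the message cannot drift from the configured threshold.
fn warning_message(op_kind: &str, op_name: &str) -> String {
    format!(
        "Function '{}' ({}) is taking longer than {}s",
        op_kind,
        op_name,
        WARNING_THRESHOLD.as_secs()
    )
}
```

Because both constants share one type, they can be compared or passed to timer APIs directly, without a `Duration::from_secs` conversion at each call site.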


Development

Successfully merging this pull request may close these issues.

[FEATURE] Avoid a single function stuck forever
